Real-time audio analysis tools for Pd and MSP
Abstract
Two "objects," which run under Max/MSP or Pd, do different kinds of real-time analysis of musical sounds. Fiddle is a monophonic or polyphonic maximum-likelihood pitch detector similar to Rabiner's, which can also be used to obtain a raw list of a signal's sinusoidal components. Bonk does a bounded-Q analysis of an incoming sound to detect onsets of percussion instruments in a way which outperforms the standard envelope following technique. The outputs of both objects appear as Max-style control messages.

1 Tools for real-time audio analysis

The new real-time patchable software synthesizers have finally brought audio signal processing out of the ivory tower and into the homes of working computer musicians. Now audio can be placed at the center of real-time computer music production, and MIDI, which for a decade was the backbone of the electronic music studio, can be relegated to its appropriate role as a low-bandwidth I/O solution for keyboards and other input devices.

Many sources of control "input" can be imagined beyond those provided by MIDI devices. This paper, for example, explores two possibilities for deriving a control stream from an incoming audio stream. First, the sound might contain quasi-sinusoidal "partials" and we might wish to know their frequencies and amplitudes. In the case that the audio stream comes from a monophonic or polyphonic pitched instrument, we would like to be able to determine the pitch(es) and loudness(es) of the components. It's clear that we'll never have a perfect pitch detector, but the fiddle object described here does fairly well in some cases.

For the many sounds which don't lend themselves to sinusoidal decomposition, we can still get useful information from the overall spectral envelope. For instance, rapid changes in the spectral envelope turn out to be a much more reliable indicator of percussive attacks than are changes in the overall power reported by a classical envelope follower.
The bonk object applies a bounded-Q filterbank to an incoming sound and can either output the raw analysis or detect onsets, which can then be compared to a collection of known spectral templates in order to guess which of several possible kinds of attack has occurred.

The fiddle and bonk objects are low tech; the algorithms would be easy to re-code in another language or for environments other than the ones considered here. Our main concern is to get predictable and acceptable behavior using easy-to-understand techniques which won't place an unacceptable computational load on a late-model computer.

Some effort was made to make fiddle and bonk available on a variety of platforms. They run under Max/MSP (Macintosh) and Pd (Wintel, SGI, Linux), and fiddle also runs under FTS (available on several platforms). Both are distributed with source code; see http://man104nfs.ucsd.edu/~mpuckett/ for details.

2 Analysis of discrete spectra

Two problems are of interest here: getting the frequencies and amplitudes of the constituent partials of a sound, and then guessing the pitch. Our program follows the ideas of [Noll 69] and [Rabiner 78]. Whereas the earlier pitch~ object reported in [Puckette 95] departs substantially from the earlier approaches, the algorithm used here adheres more closely to them.

First we wish to get a list of peaks with their frequencies and amplitudes. The incoming signal is broken into segments of N samples, with N a power of two typically between 256 and 2048. A new analysis is made every N/2 samples. For each analysis the N samples are zero-padded to 2N samples and a rectangular-window DFT is taken. An interesting trick reduces the computation time roughly by half for this setup; see the source code to see how this is done.
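The segmentation and zero-padding steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the fiddle source code: the function names `frames` and `zero_padded_dft`, the test signal, and the choice N = 64 are all hypothetical, and the sketch uses a direct O(N^2) DFT in place of an FFT. It does not attempt the trick that halves the computation time.

```python
import cmath
import math


def frames(signal, n):
    """Yield successive n-sample segments, advancing by n // 2 samples."""
    hop = n // 2
    for start in range(0, len(signal) - n + 1, hop):
        yield signal[start:start + n]


def zero_padded_dft(segment):
    """Rectangular-window DFT of a segment zero-padded to twice its length."""
    n = len(segment)
    size = 2 * n
    # The padding samples are zero, so each sum runs over the first n terms only.
    return [
        sum(segment[t] * cmath.exp(-2j * math.pi * k * t / size) for t in range(n))
        for k in range(size)
    ]


# Hypothetical test signal: a cosine with 5 cycles per N-sample segment.
N = 64
signal = [math.cos(2 * math.pi * 5 * t / N) for t in range(4 * N)]
spectra = [zero_padded_dft(segment) for segment in frames(signal, N)]

# Strongest bin among the non-negative frequencies of the first analysis.
peak_bin = max(range(N), key=lambda k: abs(spectra[0][k]))
```

Because the DFT has 2N points, a component with 5 cycles per segment lands on bin 10 and not bin 5. This doubled bin index is why neighbouring bins of the unpadded analysis appear two bins apart in the zero-padded spectrum.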
If we let X[k] denote the zero-padded DFT, we can do a three-point convolution in the frequency domain to get the Hanning-windowed DFT:

$$X_H[k] = \frac{X[k]}{2} - \frac{X[k+2] + X[k-2]}{4}$$

Any of the usual criteria can be applied to identify peaks in this spectrum. We then go back to the non-windowed spectrum to find the peak frequency using the phase vocoder with hop 1:

$$\omega = \frac{\pi}{N}\left(k + \mathrm{re}\left[\frac{X[k-2] - X[k+2]}{2X[k] - X[k-2] - X[k+2]}\right]\right)$$

This is a special case of a more general formula derived in [Puckette 98]. The amplitude estimate is simply the windowed peak strength at the strongest bin, which because of the zero-padding won't differ by more than about 1 dB from the true peak strength. The phase could be obtained in the same way but we won't bother with that here.

2.1 Guessing fundamental frequencies

Fundamental frequencies are guessed using a scheme somewhat suggestive of the maximum-likelihood estimator. Our "likelihood function" is a non-negative function L(f) where f is frequency. The presence of peaks at or near multiples of f increases L(f) in a way which depends on the peak's amplitude and frequency as shown:
Publication date: 1998